<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<!--**
* Copyright 2005 The Apache Software Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
-->
<head>
  <title>InstantiatedIndex</title>
</head>
<body>
<p>InstantiatedIndex, alternative RAM store for small corpora.</p>

<p>WARNING: This contrib is experimental and the APIs may change without warning.</p>
<h2>Abstract</h2>

<p>
  Represented as a coupled graph of class instances, this
  all-in-memory index store implementation delivers search
  results up to a 100 times faster than the file-centric RAMDirectory
  at the cost of greater RAM consumption.
</p>

<h2>API</h2>

<p>
  Just as the default store implementation, InstantiatedIndex
  comes with an IndexReader and IndexWriter. The latter share
  many method signatures with the file-centric IndexWriter.
</p>

<p>
  It is also possible to load the content of another index
  by passing an IndexReader to the InstantiatedIndex constructor.
</p>

<h2>Performance</h2>

<p>
  At a few thousand ~160 characters long documents
  InstantiaedIndex outperforms RAMDirectory some 50x,
  15x at 100 documents of 2000 charachters length,
  and is linear to RAMDirectory at 10,000 documents of 2000 characters length.
</p>

<p>Mileage may vary depending on term saturation.</p>

<img src="doc-files/HitCollectionBench.jpg" alt="benchmark"/>

<p>
  Populated with a single document InstantiatedIndex is almost, but not quite, as fast as MemoryIndex.    
</p>

<p>
  It takes more or less the same time to populate an InstantiatedIndex
  as it takes to populate a RAMDirectory. Hardly any effort has been put
  in to optimizing the InstantiatedIndexWriter, only minimizing the amount
  of time needed to write-lock the index has been considered.
</p>

<h2>Caveats</h2>
<ul>
  <li>No locks! Consider using InstantiatedIndex as if it was immutable.</li>
  <li>No documents with fields containing readers.</li>
  <li>No field selection when retrieving documents, as all stored field are available in memory.</li>
  <li>Any document returned must cloned if they are to be touched.</li>
  <li>Norms array returned must not be touched.</li>
</ul>

<h2>Use cases</h2>

<p>
  Could replace any small index that could do with greater response time.
  spell check a priori index,
  the index of new documents exposed to user search agent queries,
  to compile classifiers in machine learning environments, et c.
</p>

<h2>Class diagram</h2>
<a href="doc-files/classdiagram.png"><img width="640px" height="480px" src="doc-files/classdiagram.png" alt="class diagram"></a>
<br/>
<a href="doc-files/classdiagram.uxf">Diagram</a> rendered using <a href="http://umlet.com">UMLet</a> 7.1.
</body>
</html>
